Search results for all records where Creators/Authors contains: "Dasgupta, Aritra"

  1. Algorithmic rankers are ubiquitously applied in automated decision systems for hiring, admission, and loan approval. Without appropriate explanations, decision-makers often cannot audit or trust the outcomes of algorithmic rankers. In recent years, XAI (explainable AI) methods have focused on classification models; for algorithmic rankers, state-of-the-art explanation methods have yet to be developed. Moreover, explanations are sensitive to changes in data and ranker properties, and decision-makers need transparent model diagnostics for calibrating the degree and impact of ranker sensitivity. To fulfill these needs, we take a dual approach of: i) designing explanations by transforming Shapley values for the simple form of a ranker based on linear weighted summation, and ii) designing a human-in-the-loop sensitivity analysis workflow by simulating data whose attributes follow user-specified statistical distributions and correlations. We use a visualization interface to validate the transformed Shapley values and draw inferences from them by leveraging multi-factorial simulations spanning data distributions, ranker parameters, and rank ranges. (A toy sketch of the linear-ranker attribution follows this entry.)
    Free, publicly-accessible full text available June 18, 2024
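    As a minimal, illustrative sketch (not the paper's implementation), the attribution described above can be reproduced for a purely linear weighted-summation ranker, where the Shapley value of attribute j for an item x reduces to w_j * (x_j - mean(x_j)). The data, attribute count, and weights below are hypothetical.

        import numpy as np

        # Hypothetical standardized attributes (rows = items, columns = attributes).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 3))
        w = np.array([0.5, 0.3, 0.2])               # weights of the linear weighted-summation ranker

        scores = X @ w                              # ranker score f(x) = sum_j w_j * x_j
        ranks = np.argsort(-scores).argsort() + 1   # rank 1 = highest score

        # For a linear model, the Shapley value of attribute j for item x is
        # phi_j(x) = w_j * (x_j - E[x_j]); the phi_j sum to f(x) minus the mean score.
        phi = w * (X - X.mean(axis=0))

        # Per-attribute contributions that explain the top-ranked item's position.
        top = int(np.argmin(ranks))
        print("rank-1 item contributions:", phi[top], "score:", scores[top])

    Rerunning the same attribution under different simulated distributions or weights is one way to probe the kind of ranker sensitivity the abstract refers to.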
  2. Open datasets that contain personal information are susceptible to adversarial attacks even when anonymized. By performing low-cost joins on multiple datasets with shared attributes, malicious users of open data portals might gain access to information that violates individuals' privacy. However, open datasets are primarily published under a release-and-forget model, whereby data owners and custodians have little to no cognizance of these privacy risks. We address this critical gap by developing a visual analytic solution that enables data defenders to gain awareness of the disclosure risks in local, joinable data neighborhoods. The solution is derived through a design study with data privacy researchers, in which we initially play the role of a red team and engage in an ethical data-hacking exercise based on privacy attack scenarios. We use this problem and domain characterization to develop a set of visual analytic interventions as a defense mechanism and realize them in PRIVEE, a visual risk inspection workflow that acts as a proactive monitor for data defenders. PRIVEE uses a combination of risk scores and associated interactive visualizations to let data defenders explore vulnerable joins and interpret risks at multiple levels of data granularity. We demonstrate how PRIVEE can help emulate attack strategies and diagnose disclosure risks through two case studies with data privacy experts. (A toy example of such a quasi-identifier join follows this entry.)
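    To make the join-based disclosure risk concrete, here is a toy sketch (not PRIVEE's risk model; the datasets and column names are invented) that links two releases on shared quasi-identifier attributes and flags joined groups of size one, i.e., attribute combinations that pin down a single individual.

        import pandas as pd

        # Invented open-data releases that share quasi-identifier attributes.
        health = pd.DataFrame({
            "zip": ["07102", "07102", "07103"],
            "birth_year": [1980, 1975, 1990],
            "gender": ["F", "M", "F"],
            "diagnosis": ["A", "B", "C"],       # sensitive attribute
        })
        voters = pd.DataFrame({
            "zip": ["07102", "07103", "07104"],
            "birth_year": [1980, 1990, 1985],
            "gender": ["F", "F", "M"],
            "name": ["Alice", "Carol", "Dan"],  # identifying attribute
        })

        quasi_ids = ["zip", "birth_year", "gender"]

        # A low-cost join on the shared attributes links sensitive rows to named individuals.
        linked = health.merge(voters, on=quasi_ids, how="inner")

        # Simple risk signal: joined groups of size one re-identify exactly one person
        # (k-anonymity with k = 1 within the joined neighborhood).
        group_sizes = linked.groupby(quasi_ids).size()
        print((group_sizes == 1).sum(), "quasi-identifier combinations re-identify a single record")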
  3. Modern social media platforms such as Twitch and YouTube embody an open space for content creation and consumption. However, an unintended consequence of such content democratization is the proliferation of toxicity and abuse that content creators are subjected to. Commercial and volunteer content moderators play an indispensable role in identifying bad actors and minimizing the scale and degree of harmful content. Moderation tasks are laborious and complex, and even when semi-automated, they involve high-consequence human decisions that affect the safety and popular perception of the platforms. In this paper, through an interdisciplinary collaboration among researchers from social science, human-computer interaction, and visualization, we present a systematic understanding of how visual analytics can help in human-in-the-loop content moderation. We contribute a characterization of the data-driven problems and needs for proactive moderation and present a mapping between the needs and visual analytic tasks through a task abstraction framework. We discuss how the task abstraction framework can be used for transparent moderation, for designing interventions for moderators' well-being, and, ultimately, for creating future human-machine interfaces for data-driven content moderation.
  4. Despite the widespread use of communicative charts as a medium for scientific communication, we lack a systematic understanding of how well such charts fulfill the goals of effective visual communication. Existing research mostly focuses on the means, i.e., the encoding principles, and not the end, i.e., the key takeaway of a chart. To address this gap, we start from first principles and aim to answer the fundamental question: how can we describe the message of a scientific chart? We contribute a fact-evidence reasoning framework (FaEvR) by augmenting the conventional visualization pipeline with stages for gathering and associating evidence to decode the facts presented in a chart. We apply the resulting fact-and-evidence classification scheme to 500 charts drawn from publications in multiple science domains. We demonstrate the practical applications of FaEvR in calibrating task complexity and detecting barriers to chart interpretability.
  5. Geographical maps encoded with rainbow color scales are widely used for spatial data analysis in climate science, despite evidence from the visualization literature that they are not perceptually optimal. We present a controlled user study that compares the effect of color scales on performance accuracy for climate-modeling tasks, using pairs of continuous geographical maps generated from climatological metrics. For each pair of maps, 39 scientist-observers judged: i) the magnitude of their difference, ii) their degree of spatial similarity, and iii) the region of greatest dissimilarity between them. Besides the rainbow color scale, two other continuous color scales were chosen such that the three scales covaried along two dimensions (luminance monotonicity and hue banding) hypothesized to affect visual performance. We also analyzed subjective measures such as user confidence, perceived accuracy, preference, and familiarity with the different color scales. We found that monotonic-luminance scales produced significantly more accurate judgments of magnitude difference but were not superior in spatial comparison tasks, and that hue banding had differential effects depending on the task and conditions. Scientists expressed the highest preference, perceived confidence, and perceived accuracy with the rainbow scale, despite its poor performance on the magnitude comparison tasks. (A toy luminance-monotonicity check follows this entry.)
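    As a rough illustration of the luminance-monotonicity dimension varied in the study (this is not the authors' stimulus-generation code, and relative luminance is only an approximation of perceptual lightness), the sketch below checks whether a colormap's luminance ramp increases monotonically; matplotlib's 'jet' stands in for a rainbow scale and 'viridis' for a luminance-monotonic one.

        import numpy as np
        import matplotlib

        def luminance_is_monotonic(name, n=256):
            """Approximate check of whether a named colormap's luminance only increases."""
            rgba = matplotlib.colormaps[name](np.linspace(0, 1, n))
            # Rec. 709 relative luminance as a cheap stand-in for perceptual lightness (CIE L*).
            y = rgba[:, :3] @ np.array([0.2126, 0.7152, 0.0722])
            return bool(np.all(np.diff(y) >= 0))

        for name in ["viridis", "jet"]:   # 'jet' serves as the rainbow example
            print(name, "monotonic luminance:", luminance_is_monotonic(name))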